-
Notifications
You must be signed in to change notification settings - Fork 20.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/evm, eth/tracers: refactor structlogger + make it streamable #30806
Conversation
3990d1c
to
c2da651
Compare
eth/tracers/logger/logger.go
Outdated
// create a log | ||
if l.writer == nil { | ||
// Non-streaming, need to copy slices. | ||
log.Memory = slices.Clone(log.Memory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it makes more sense to write them to a buffer in this case and flush them at the end if we want to support the non-streaming case. I don't see a case in the codebase where we use the structlogs directly, only ever in a formatted/textified form
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buffering is really due to them being included as items in a json-list, in the rpc response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean shouldn't we marshal them here already into a buffer and not store them as individual logs, but I guess that will break external projects using the struct logger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We kind of could, but then we would have to redefine the other rpc message to not double-encode the already encoded snippets. It would allow us to have a better estimation of how much data we are accumulating, so instead of limiting for n
number of items, we could limit for a certain number of MB.
Is there any other reason for you to suggest this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, basically these reasons. Will also limit the peak amount of memory, since we can free up the un-encoded data quickly and don't have to keep it around until the end of the encoding. Average memory consumption will go slightly up though (encoded bigger then not encoded)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you are suggesting makes sense though, because it simplifies a whole lot.
I have now pushed some changes in the non-streaming encoder, which is used when returning "legacy" traces in the debug api. To ensure that nothing substantial changed, this is how I tested it:
The expected response is
The only diff between this PR and master is that this omits empty 9d8
< memory: [],
17d15
< memory: [], The early encoder makes it so we don't have to copy slices explicitly, we just encode a |
6142b72
to
84783af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I think @s1na should also take a look
eth/tracers/tracers_test.go
Outdated
// Create a tracer which records the number of steps | ||
var steps = 0 | ||
tracer := &tracing.Hooks{ | ||
OnOpcode: func(pc uint64, op byte, gas, cost uint64, scope tracing.OpContext, rData []byte, depth int, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like at this point we are benching EVM execution :) it seems this benchmark was introduced in #23016 to test optimizations to the struct logger. What do you think about dropping it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sgtm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it instead in the latest commit. It's horribly much slower though, due to the json-encoding and extra mem usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You were not kidding. It's 10x slower.
PR:
goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/eth/tracers
BenchmarkTransactionTraceV2-11 15 74619664 ns/op 87275606 B/op 897587 allocs/op
PASS
Master:
goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/eth/tracers
BenchmarkTransactionTrace-11 163 6710361 ns/op 3798259 B/op 81648 allocs/op
PASS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I compared the speed on mainnet txes and didn't notice a significant difference. The bench above might be a fluke because we moved the tracer setup now to the benched portion of the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a "fluke"
due to the json-encoding
Previously, the bench collected a bunch of objects. Now we json-encode them. Thus, encoding overhead, plus the fact that memory size of json-encoded data over native size.
In real life, we would have json-encoded later instead. So it doesn't matter a whole lot.
eth/tracers/logger/logger.go
Outdated
|
||
// toLegacyJSON converts the structlog to json-encoded legacy form (StructLogRes). | ||
// | ||
// The differences between the structlog json and the 'legacy' json are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we use the non-legacy json format? I can't seem to find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment above, with debug tracecall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go run ./cmd/geth --dev console --exec 'debug.traceCall({input: "0x600051600155602051600255"},"latest" , {"enableMemory": true})'
d2af996
to
f18d591
Compare
The updated benchmark is a lot slower than previously, since previously the json-encoding was deferred to later. Also, storing the data as json-encoded strings is larger than the raw bytes. The BenchmarkTransactionTraceV2 is thus renamed from the original
f18d591
to
cd66f17
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
// check if already accumulated the specified number of logs | ||
if l.cfg.Limit != 0 && l.cfg.Limit <= len(l.logs) { | ||
// check if already accumulated the size of the response. | ||
if l.cfg.Limit != 0 && l.resultSize > l.cfg.Limit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, the Config.Limit
refers to the maximum length of output, and it's changed to the maximum size of output with this change.
We should probably highlight the config change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. That config is a bit wonky. Was meaning to bring that up. The config is set by the caller. So the caller can say "Please don't feed me 1GB of data". But now that we're operating in bytes
, we could also ensure that we don't try to deliver more than the configure http response size limit, for example. We have such a thing, right @fjl ?
@s1na @holiman we have multiple logger implementations, e.g. StructLog, JsonLog, MarkdownLog. What I can tell is that the major difference is the output format. Do you guys think it makes sense to refactor the logger in the future that: We have a common logger implementation with a config, callers can configure the types of information they want, and also the output format they choose. We can only maintain a single logger implementation in this way maybe? |
I mean, we're kind of going in that direction. This PR streamlines the loggers a bit, but, the structlogger is still the odd one out, due to the buffering capability. If we could fully lift that out, then we'd be even closer. |
This PR refactors the structlog a bit, making it so that it can be used in a streaming mode.
OBS: this PR makes a change in the input
config
config, the third input-parem field todebug.traceCall
. Previously, seteting it to e.g.{"enableMemory": true, "limit": 1024}
would mean that the response was limited to1024
items. Since an 'item' may include both memory and storage, the actual size of the response was undertermined.After this change, the response will be limited to
1024
bytes
(or thereabouts).The commandline usage of structlog now uses the streaming mode, leaving the non-streaming mode of operation for the eth_Call.
There are two benefits of streaming mode
Before:
After